However, the Pixtral image processor also uses the default pad function from image_transforms.py, which pads with zeros, i.e. black pixels, by default.
Doesn't this mean you could end up with an image that has a white background and a black padded border? That doesn't seem right, but I'd like to know whether it's intentional.
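For illustration, here's a minimal sketch of the combination I'm asking about, using transformers.image_transforms.pad with its default constant_values=0.0 (the image size and padding amounts are made up for the example):

```python
import numpy as np
from transformers.image_transforms import pad

# A small all-white RGB image, standing in for an RGBA input that was
# composited onto a white background.
white = np.full((4, 4, 3), 255, dtype=np.uint8)

# The default mode is constant padding with constant_values=0.0,
# so the padded region comes out black.
padded = pad(white, padding=((2, 2), (2, 2)))

print(padded.shape)  # (8, 8, 3)
print(padded[0, 0])  # [0 0 0] -> a black border around a white image
```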
For example, the image conversion and padding functions used elsewhere in Hugging Face Transformers, such as for Qwen2-VL, seem to black out the alpha channel. This is probably not intentional, or rather, probably not meaningful; the developers likely just care about getting consistent results.
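As a reference point, here's a small PIL sketch of the two behaviors. This isn't the actual Transformers code, just an illustration of why dropping the alpha channel tends to produce black where compositing onto white would not:

```python
from PIL import Image

# A fully transparent RGBA image whose underlying RGB values happen to be black.
rgba = Image.new("RGBA", (2, 2), (0, 0, 0, 0))

# Naive conversion simply drops the alpha channel, so the "transparent"
# region comes out black instead of being composited onto a background.
naive = rgba.convert("RGB")
print(naive.getpixel((0, 0)))  # (0, 0, 0)

# Compositing onto an explicit white canvas first keeps the background white.
canvas = Image.new("RGB", rgba.size, (255, 255, 255))
canvas.paste(rgba, mask=rgba.getchannel("A"))
print(canvas.getpixel((0, 0)))  # (255, 255, 255)
```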
(I suspect it's left as-is because, with larger models, whether the padding background is black or white doesn't seem to have a significant impact on the results…)
Regarding Pixtral, the white fill seems to have been there since the first commit, so I wonder whether the model was trained with that in mind. Since there doesn't seem to be any discussion of it, I think you'd have to ask the committer on GitHub for clarification.